Fixed Frame Temporal Pooling

Authors

John Thornton

Linda Main

Andrew Srbic

Abstract

Applications of unsupervised learning techniques to action recognition have proved highly competitive with supervised and hand-crafted approaches, despite not being designed to handle image processing problems. Many of these techniques are either based on biological models of cognition or have responses that correlate with those observed in biological systems. In this study we apply (for the first time) an adaptation of the latest hierarchical temporal memory (HTM) cortical learning algorithms (CLAs) to the problem of action recognition. These HTM algorithms are unsupervised and represent one of the most complete high-level syntheses available of the current neuroscientific understanding of how the neocortex functions. Specifically, we extend the latest HTM work on augmented spatial pooling to produce a fixed frame temporal pooler (FFTP). This pooler is evaluated on the well-known KTH action recognition data set and compared with the best-performing unsupervised learning algorithm for bag-of-features classification in the area: independent subspace analysis (ISA). Our results show that FFTP comes within 2% of ISA’s performance and outperforms other comparable techniques on this data set. We take these results to be promising, given the preliminary nature of the research and the fact that the FFTP algorithm is only a partial implementation of the proposed HTM architecture.

Similar resources

Temporal Pyramid Pooling Based Convolutional Neural Networks for Action Recognition

Encouraged by the success of Convolutional Neural Networks (CNNs) in image classification, recently much effort has been spent on applying CNNs to video based action recognition problems. One challenge is that video contains a varying number of frames, which is incompatible with the standard input format of CNNs. Existing methods handle this issue either by directly sampling a fixed number of frames or ...

Order-aware Convolutional Pooling for Video Based Action Recognition

Most video based action recognition approaches create the video-level representation by temporally pooling the features extracted at each frame. The pooling methods that they adopt, however, usually completely or partially neglect the dynamic information contained in the temporal domain, which may undermine the discriminative power of the resulting video representation since the video sequence ...

Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification

We propose a simple, yet novel, multi-layer model for the problem of phonetic classification. Our model combines the frame level transformation of the acoustic signal with the segment level transformation via a temporal pooling architecture to compute class conditional probabilities of phones. Without the use of any phonetic knowledge, our model achieved the state-of-the-art performance on the T...

Second-order Temporal Pooling for Action Recognition

Most successful deep learning models for action recognition generate predictions for short video clips, which are later aggregated into a longer time-frame action descriptor by computing a statistic over these predictions. Zeroth-order (max) or first-order (average) statistics are commonly used. In this paper, we explore the benefits of using second-order statistics. Specifically, we propose a novel e...

Learning End-to-end Video Classification with Rank-Pooling

We introduce a new model for representation learning and classification of video sequences. Our model is based on a convolutional neural network coupled with a novel temporal pooling layer. The temporal pooling layer relies on an inner-optimization problem to efficiently encode temporal semantics over arbitrarily long video clips into a fixed-length vector representation. Importantly, the repre...

Publication date: 2012